Zero Assumption Recovery (ZAR) version 6.3
FAT16/FAT32 recovery (ZARFAT)
USER'S MANUAL
Copyright (C) Alexey V. Gubin, 1999-2002

*** ACKNOWLEDGEMENTS ***

Thanks to Alexey Ermoshkin for numerous ideas and for his great patience
during beta tests.

*** PURPOSE ***

ZARFAT is a READ-ONLY data recovery program. It can be used in cases where
the drive was FDISKed, formatted, hit by a virus or corrupted in some other
way. Supported filesystems are
* FAT16 (used by Windows 95/NT) and
* FAT32 (used by Windows 95, 98, ME, 2000 and XP).
Long file names (LFN) are supported.

*** SYSTEM REQUIREMENTS ***

* 386 or better processor
* 4Mb memory + 2Mb memory per gigabyte of volume to recover
* Additional disk device to store recovered data (preferably another HDD)
* MS-DOS operating system or MS Windows 9x in DOS mode
* HIMEM.SYS driver installed

*** DATA RECOVERY PROCESS ***

It is recommended that you print this manual so you can refer to it during
the recovery process.

Volume reconstruction consists of the following stages:
1. Determining the area to be recovered and configuring options
2. Pattern scanning of that area
3. Reconstruction of disk parameters based on pattern scan results
4. Reconstruction of FAT tables (if possible)
5. Reconstruction of the directory tree
6. Recovering selected directories to another medium
7. Renaming recovered files to their corresponding long names.

*** STAGE 0 - SETTING UP RECOVERY PROCESS ***

ZAR cannot be used from within any multitasking environment (including
Windows), so you must reboot from a floppy disk or restart Windows in
MS-DOS mode.

0.1 - CHARACTER SET SPECIFICATION

To improve recovery quality, you must specify the character set your
operating system uses in file names. Default values for English-only
systems are hardcoded into the program. Additional characters (Russian,
for example) should be written to the file "CHARSET.DAT", which must be
placed in the same directory as the ZAR executable files.

WARNING: if you skip this step and do not specify the character set, files
and directories with names containing local (non-English) characters will
be considered invalid and WILL NOT BE RECOVERED.

For an English-only system, you should rename CHARSET.ENG (an empty file)
to CHARSET.DAT.
For a Russian-enabled system, you should use CHARSET.RUS.

To create a custom character set, create a text file containing the
characters of your choice and save it as CHARSET.DAT. This file will be
loaded and the characters it contains will be added to the default
(English) character set. Formatting of the text file does not matter,
because CR/LF newline, space and tab characters are removed when the file
is loaded.

You cannot remove or modify the default (built-in) character set. This is
by design.

0.2 - LOG FILE LOCATION

When you start ZAR, you will be asked for the log file location. Please
note that the log file can grow as large as 10 megabytes (for large disks)
and is written to quite often. If you want logging, you should therefore
put the log file on a large and fast medium (I recommend using the same
medium you plan to recover data to). You can press ENTER to accept the
default location (the ZAR.EXE directory), enter NUL to disable logging, or
enter a custom log file name.

*** STAGE 1 - DETERMINING THE AREA TO BE RECOVERED ***

First of all, you are prompted to select the physical disk you want to
recover. ZAR displays a list of compatible disks found, showing their
parameters (including capacity). Highlight the desired drive and hit
"Enter".

Two additional options exist here:
1. You can load the disk image from a file. For more information about
   disk images, see Appendix A.
2. You can choose to load previously saved scan results. A save file is
   only valid for the disk the scan was run on, so the file name is
   requested later: after choosing this option, first select the disk (or
   image file) that matches the save file. Once the disk is selected, you
   will be asked for the save file name.
   The program checks the disk size stored in the save file against the
   size of the selected disk (or image) and displays a warning if it
   detects a mismatch.

If a physical disk is selected, the program performs some simple hardware
diagnostics and tries to read the partition table to determine the volume
layout. If the partition table is (at least partially) correct, the list
of available volumes is displayed with the following information:

1. Partition type ("Type")
   This can be either PRI for a primary partition or EXT for a logical
   drive in an extended partition. This field is only provided for
   reference, and you can ignore it if you are not familiar with these
   FDISK terms.
2. Filesystem type for the partition ("OS / Filesystem type")
   This is the file system type as indicated by the partition table. It
   will be FAT16 or FAT32 in most cases.
3. Active partition flag ("Active")
   This is "Yes" when the volume is used for OS startup. Otherwise "No" is
   displayed. Only a single partition on a physical disk can be active,
   and it must be primary ("PRI" partition type shown).
4. Start offset, in megabytes ("Start at, Mb")
   This is the offset of the first partition sector (from the start of the
   disk) in megabytes. In most cases it should be equal to the sum of the
   sizes of all previous volumes.
5. Volume size, in megabytes ("Vol. Size, Mb")
   This is the volume size as indicated by the partition table.
6. Boot sector signature status ("Boot Sig")
   Shows whether the volume boot sector looks correct. Can be either
   "Good" or "Bad".

A number of tests are performed to check the partition table for
consistency. Should these tests fail, a warning is displayed stating that
the partition table is untrustworthy and describing the problems found.
Possible causes include:
1. Too much space left unallocated (this can be a false alarm).
2. Volumes that are larger than the physical disk can hold.
3. Some volumes overlapping each other.
4. Some volumes having bogus records (such as an end sector before the
   start sector) that can still be recognized.
5. Some partition table records damaged beyond recognition.

Case 1.1 - PARTITION TABLE EXISTS AND IS CORRECT

This is the case when no messages about bad signatures are shown and the
volume layout shown on the "Select partition to recover" screen is
correct. In this case, simply select the volume you want to recover from
the list displayed.

Case 1.2 - PARTITION TABLE IS DAMAGED OR CONTAINS INCORRECT DATA

Systems with a missing (e.g. overwritten) partition table exhibit the
following symptoms:
a. Volumes that are known to be on the disk are not shown when the
   operating system starts
b. ZAR reports that "Partition table sector 0 signature is bad" and/or
   "Partition table seems to be damaged"
c. ZAR shows no volumes on the "Select partition to recover" screen

Systems with a (partially) corrupt partition table exhibit the following
symptoms:
a. Some volumes are not shown when the operating system starts
b. ZAR reports that "Partition table sector N signature is bad" and N is
   not zero
c. ZAR shows incomplete or incorrect information about volumes

You may also want to specify the volume layout manually after accidentally
FDISKing the drive. In that case the partition table is valid but contains
the wrong information.

If the volume you want to recover is either missing or displayed
incorrectly, you should manually select the area for recovery. This is
done by entering the start sector and the size of the volume. These values
are accepted in sectors or in megabytes (which are automatically
recalculated to sectors). The volume start offset (first sector) is
usually the sum of the sizes of all volumes preceding the volume in
question.
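
The "sum of preceding sizes" rule can be illustrated with a short Python
sketch. This is not ZAR code: the function names are hypothetical, and the
512-byte sector size and binary megabytes (1 Mb = 1,048,576 bytes) are
assumptions made only for the illustration.

```python
SECTOR_SIZE = 512            # assumption: bytes per sector
BYTES_PER_MB = 1024 * 1024   # assumption: binary megabytes


def mb_to_sectors(megabytes):
    """Convert a size or offset in megabytes to a sector count."""
    return megabytes * BYTES_PER_MB // SECTOR_SIZE


def start_offsets_mb(volume_sizes_mb):
    """Return the start offset of each volume, in megabytes.

    Each offset is the sum of the sizes of all preceding volumes.
    """
    offsets = []
    running_total = 0
    for size in volume_sizes_mb:
        offsets.append(running_total)
        running_total += size
    return offsets


# Illustrative layout only: three volumes of 2000, 3000 and 4000 Mb.
sizes = [2000, 3000, 4000]
for size, offset in zip(sizes, start_offsets_mb(sizes)):
    print(f"offset {offset} Mb = sector {mb_to_sectors(offset)}, "
          f"size {size} Mb = {mb_to_sectors(size)} sectors")
```

The offsets obtained this way are only estimates, since the sizes you
remember may have been reported in different units.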

As an example, assume the disk had the following partition layout:
   C: - 5 Gb volume (system startup)
   D: - 5 Gb volume
   E: - 10 Gb volume
giving 20 Gb of total hard disk capacity.

The corresponding start offsets are 0 Mb for volume C:, 5000 Mb for D: and
10000 Mb for E:. It is recommended to subtract 10..100 Mb from the start
offset (to avoid calculation errors, such as a possible confusion between
decimal and binary megabytes) and to add the same amount to the volume
size. Taking this into account, the following values should be used for
this example:
   C: - 0 Mb offset, 5000 Mb size
   D: - 4900 Mb offset, 5100 Mb size
   E: - 9900 Mb offset, 10100 Mb size.

WARNING: If you specify an incorrect area to search in, the entire
recovery will fail.

*** INTERFACE TYPE SELECTION ***

Starting with version 4.0.0, ZAR supports two operation modes, namely

1. Simple mode
   In Simple mode all options are set to default values (which were
   developed to fit general usage patterns) and ZAR makes all decisions
   about filesystem structures automatically (instead of asking for your
   confirmation). Generally, this is the recommended mode of operation.
   In Simple mode, scan results are automatically saved upon scan
   completion (to a file named AUTOXXXX.SAV in the current directory).
   If you plan to use Simple mode, continue reading from Stage 6 -
   "Recovering files".

2. Advanced mode
   Advanced mode allows you to configure recovery options and to influence
   recovery decisions. These options are usually useful for experts only.

*** CONFIGURING RECOVERY OPTIONS ***

Once the disk area to scan has been determined, you are presented with the
following list of options. In Simple mode, the default values are accepted
automatically.

1. "Recover long file names". "Yes" or "No", "Yes" by default.
   This toggles the recording of long file name information (see below for
   details of the LFN reconstruction process). You might want to disable
   this option if you run out of memory.

2. "Ignore case in long file names". "Yes" or "No", "No" by default.
   This option only controls the recovery behaviour for LFNs which fit
   into the 8.3 format but have a specific letter case (for example
   "TeSt.LfN"). With this option enabled ("Yes"), such long names are not
   recovered (the case is converted to all capitals), so the above example
   becomes "TEST.LFN". With this option set to "No", exact recovery of the
   long name is attempted.

3. "Recover erased files". "Yes" or "No", "Yes" by default.
   This option controls the recovery of files erased prior to the disk
   crash. It is not of much use, except for mass undeletion after a virus
   attack.

4. "LFN filtering mode". "Strict" or "Loose", "Loose" by default.
   Controls correctness checking on long file names. MUST BE set to
   "Loose" if you want to recover erased files (see #3) and their
   corresponding LFNs. In all other cases "Strict" is recommended.

5. "Skip files when both FATs are bad". "Yes" or "No", "No" by default.
   With this option set to "Yes", a file is skipped (not recovered) when
   both FAT entries are incorrect. The default behaviour is to attempt
   recovery (but the results are usually not good on fragmented volumes).

6. "Allow N invalid symbols in names". Number from 0 to 11, 0 by default.
   Controls how many invalid characters in a file or directory name are
   tolerated. If a directory name exceeds the threshold, the directory can
   still be recovered but it will be named "DIRxxx". If a file name is
   invalid, the file will not be recovered at all. All file and directory
   names discarded by this rule are logged.

7. "Skip files > X Mb, 0 - all files". Number from 0 to 2047, 100 by
   default.
   With this option active, files greater than X Mb in size are not
   recovered. A value of 0 disables size checking. I consider the default
   100 Mb limit to be acceptable (swapfiles and MPEG/AVI videos are common
   examples of what will be filtered out).

8. "Simulation mode (DEBUG)". "Yes" or "No", "No" by default.
   This option is intended primarily for debugging purposes. It SHOULD NOT
   be used during a normal recovery run. If "Simulation" is set to "Yes",
   ZAR creates the requested directories and files, but NO DATA WILL BE
   WRITTEN to the files (they will all be of zero size). However, the log
   file is still created.

When the options are configured, hit "Proceed".

*** STAGE 2 - PATTERN SCANNING ***

Pattern scanning is used to detect all recognizable pieces of data
remaining on the volume. It is always a tradeoff between gathering as much
data as possible (to allow for successful recovery) and keeping the amount
of gathered data manageable (for higher analysis speed). Recognition of
some types of disk areas is mandatory (these are system structures, namely
the Boot Record and Directories). You can disable recognition of the
others, but it is strongly recommended that you leave the default setting
(all enabled) and select "Proceed".

The program scans the area you selected during Stage 1 and locates all
data pieces it can recognize. This information is then used in the
analysis.

*** STAGE 3 - DISK PARAMETERS RECONSTRUCTION ***

Locations of files on the volume are expressed in cluster numbers, while
the same locations on the physical disk must be expressed in sectors. The
sector number is computed from the cluster number by the following simple
formula:

   Sector = CF * Cluster + SS

where CF (Cluster Factor) is the number of sectors per cluster and SS
(Start Sector) is the sector number of cluster 0.

When the volume is damaged, the values of CF and SS are usually lost or
corrupt, so they are determined statistically from the pattern scan
results. The automatic CF determination procedure simply tries all
possible values from 1 to 512 and computes the number of errors for each
value. The table usually looks as follows:

   CF = 1  gives  0% errors
   CF = 2  gives  0% errors
   CF = 4  gives  0% errors
   CF = 8  gives 50% errors
   CF = 16 gives 75% errors
   and so on.
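
The Sector = CF * Cluster + SS formula from this stage can be written as a
small Python sketch. The function name and the sample values (CF = 8,
SS = 1000) are hypothetical and chosen only for illustration; they are not
taken from ZAR.

```python
def cluster_to_sector(cluster, cf, ss):
    """Return the first disk sector of a cluster.

    cf is the Cluster Factor (sectors per cluster) and ss is the
    Start Sector (the sector number of cluster 0).
    """
    return cf * cluster + ss


# Illustrative values only: 8 sectors per cluster, cluster 0 at sector 1000.
CF = 8
SS = 1000
for cluster in (0, 1, 2, 100):
    first = cluster_to_sector(cluster, CF, SS)
    last = first + CF - 1   # a cluster occupies CF consecutive sectors
    print(f"cluster {cluster}: sectors {first}..{last}")
```

A wrong CF or SS shifts every computed sector, which is why both values
have to be established before any file can be located.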

The largest CF value that gives less than 20% errors (which can arise from
old data and from pattern scanning errors) is considered good. You can,
however, enter a broader range of CFs to search if you know that the
autodetected value is incorrect.

For each CF value in the selected range, the Start Sector (SS) is guessed.
This involves an enormous amount of computation and can take a long time
(as long as 20 minutes on older machines). When it is finally done, you
choose between several variants, which are displayed with their
corresponding relevance values. In most cases you should select the first
(default) variant, which has the maximum relevance.

*** STAGE 4 - FAT RECONSTRUCTION ***

During this step ZAR tries to identify where the FATs are located on the
disk. You will be prompted for the FAT type. If you know for certain that
the volume was using FAT16 or FAT32, specify it. Otherwise (or if you are
not absolutely sure) select "Autodetect filesystem type".

A list of several (most relevant) variants is shown with the following
information:

1. FAT start sector and size ("StartSec/Size")
   These two parameters specify the location of the first FAT on the disk.
2. "FAT type"
   Filesystem type, either FAT16 or FAT32. Entries with a FAT type not
   matching the expected type (if known) are junk.
3. Approximate volume size for this FAT size, in megabytes
   ("Volume size, Mb")
   Shows the approximate size of the volume, computed from the FAT size
   and the cluster size. This can differ from the actual volume size, but
   by no more than 1..2% of the volume size. Entries showing an invalid
   size (e.g. 40Mb for a 2Gb volume) are junk.
4. FAT signature status ("Signature")
   This can be "Good", "Partial" or "Bad". "Good" means that correct
   signatures are found in both FAT copies, "Partial" means that a valid
   signature is found in only one copy, and "Bad" means that no signatures
   are valid.
5. Boot sector approval ("Boot")
   Can be either "Good" or "Bad", indicating whether the boot sector
   contains values matching this entry. This, if present, is the most
   reliable indication of validity. If no approved entries exist, it
   probably means that the boot sector is damaged beyond recognition.
6. Match factor and relevance ("M.F./Rel.")
   These parameters indicate how good the proposed solution is. You should
   usually select the entry with the better relevance.

There are two FAT copies on a volume, and two entries are usually
displayed (in cases where the FATs are intact).

One for the first copy:
   Start = ST
   Size = SZ
   "Both" signatures
   "Good" boot approval
   Relevance = R
and another for the second copy:
   Start = ST+SZ
   Size = SZ
   "Partial" signatures
   "Bad" boot approval
   Relevance = R/2

The first of these should be selected. If the first FAT is corrupt, its
relevance may be lower than expected. Generally, if two variants are shown
with matching volume sizes and identical FAT sizes, and StartSec of the
second one is equal to (StartSec + FAT size) of the first, then the first
variant should be selected.

The list can be empty (if both FATs are damaged beyond recognition). In
this case you can either enter the parameters manually (if you know them)
or choose to discard the FATs and continue recovery without FAT
information.

*** STAGE 5 - DIRECTORY TREE RECONSTRUCTION ***

This stage is fully automatic and you cannot interfere with the process.
During reconstruction the details are shown on screen, but you can simply
ignore them (they are recorded in the log file as well).

*** STAGE 6 - RECOVERING FILES ***

When the directory tree is refined, you are presented with a simple
directory tree viewer. Use the Up and Down arrow keys to move through the
list and the Spacebar to select or deselect a directory. The full list of
available hotkeys is shown in the bottom line of the screen.

The "M" key allows you to specify a set of DOS-style file masks to include
or exclude. Allowed wildcards are "*" (any number of any characters,
possibly zero) and "?" (exactly one character, any). The masks are
processed as follows:
1. All "Exclude" masks are checked. If any of them matches, the file is
   skipped.
2. All "Include" masks are checked. If any of them matches, the file is
   recovered.
3. If all "Exclude" and "Include" masks have been checked and no decision
   has been made, the file is skipped.

The default values are set to:
   Exclude *.TMP and *.SWP
   Include all other files (*.*)

When done, press "S" to start recovery. You will be prompted for the
target location where the recovered files will be stored.

WARNING: Never copy files to the volume you are recovering, as it is
likely to cause further damage!

During the copy process the following information is shown:
* Total number of files requested
* Number of files copied
* File name of the last copied file
* Recovery method for the last copied file

The recovery method can be one of these:
* "FAT (both)" means that both FAT chains for the file were valid and
  used.
* "FAT (1st)" or "FAT (2nd)" means that one FAT copy has an invalid chain
  for this file, and the other copy provides information that seems to be
  correct. The correct one was used.
* "DIRECT" means that there are no valid FAT chains for the file. Recovery
  is still attempted on the file, but it is likely to fail on a fragmented
  drive.

*** STAGE 7 - RECOVERING LONG NAMES ***

You must set the "Recover long file names" option to "Yes" to recover LFNs
(see "Configuring recovery options").

During the file copy operation, long file name information is collected in
a file named LFNINFO.DAT. This file is stored in the directory you
specified as the file copy destination. When copying is done, you should
boot into Windows (any version will do, but it must support the language
used on the crashed system), run FIXLFN.EXE and follow the onscreen
instructions.

IMPORTANT: Do not modify the recovered data location (i.e. do not rename
or move files) before you run FIXLFN!

*** APPENDIX A: Working with disk image files ***

A raw disk image is a sector-by-sector copy of either a physical disk or a
logical volume, stored in a file.
Raw images are limited to 2Gb in size (a FAT filesystem limitation on file
size). Image formats other than raw do exist (for example, compressed
and/or spanned over multiple files), but only raw images are currently
supported by ZAR.

Working with an image is not much different from working with the actual
disk. Select "Use image in file" when prompted for the physical disk to
repair, and type the full image file name.

If an image of a physical disk is used, the subsequent actions are the
same as for a physical disk.

If an image of a logical volume is used, the partition information shown
(if any) should be ignored and the whole image should be scanned ("There
was only one volume on disk, so I want to scan entire disk"). This is
because a logical volume image does not contain any valid partition
information, but its boot signature is valid, which can lead to
misinterpretation.
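
Because a raw image is a plain sector-by-sector copy, sector N of the
original disk or volume sits at byte offset N * (sector size) in the image
file. The Python sketch below illustrates this, together with the standard
55 AA boot signature check. It is not ZAR code: the function names are
hypothetical, the 512-byte sector size is an assumption, and ZAR's own
signature test may check more than these two bytes.

```python
import io

SECTOR_SIZE = 512  # assumption: bytes per sector


def read_sector(image, sector):
    """Read one sector from a raw image opened as a binary file object."""
    image.seek(sector * SECTOR_SIZE)
    data = image.read(SECTOR_SIZE)
    if len(data) != SECTOR_SIZE:
        raise ValueError(f"sector {sector} is beyond the end of the image")
    return data


def has_boot_signature(sector_data):
    """Check for the 55 AA marker in the last two bytes of a sector."""
    return sector_data[510:512] == b"\x55\xaa"


# Illustrative in-memory "image": sector 0 is blank, sector 1 is signed.
demo_image = io.BytesIO(bytes(SECTOR_SIZE) + bytes(510) + b"\x55\xaa")
for number in (0, 1):
    sector = read_sector(demo_image, number)
    print(f"sector {number}: signature present = {has_boot_signature(sector)}")
```

With a real image file, open it with open(name, "rb") and pass the
resulting file object in place of the in-memory buffer.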